Lossless Compression for Satellite Packet Networks Using the YK Algorithm
Abstract
The YK algorithm [1] is a powerful lossless data compression algorithm built on a grammar-based framework, which optimally combines entropy coding and string matching capabilities to achieve excellent compression. Experimental results demonstrate that it outperforms most other known lossless codes on essentially all types and sizes of data. Our paper proposes the use of the YK algorithm for data compression in communication networks that allow compression on pipelined packet streams. Lossless data compression algorithms, including the YK algorithm, are much more efficient when they process relatively large amounts of similar data. High compression is not obtained immediately after a grammar is initialized, but only once the algorithm has become tuned to the data at hand by working on a few early instances of it. One typical example of packetized data, commonly occurring over the Internet, is a sequence of HTTP requests. Although a typical HTTP request is between 200 and 400 bytes long, the bulk of those bytes are identical from one request to the next. High compression ratios are achievable when knowledge of early requests can be employed on subsequent requests. Our paper addresses the problem of compressing packetized data under the constraint that each packet must be encoded (and decoded) independently of any subsequent packet, while dependence on previous packets is allowed to enhance compression efficiency. The sequential nature of the YK algorithm allows it to continue building the grammar across different packets of data on the same pipeline, without the need to reinitialize the grammar at the beginning of each unit of data. However, the YK algorithm is not applicable in its original form. This paper proposes two packetized versions of the YK algorithm: a greedy algorithm and a non-greedy algorithm.
Surprisingly, the more intuitive greedy algorithm leads to poor compression in applications where packetization occurs at natural data boundaries, such as HTTP requests, where one packet corresponds to one request (natural packetization). However, the greedy algorithm performs slightly better in applications where packetization is performed by splitting a large data unit into multiple packets, such as HTML pages (forced packetization). Simulation results using the packetized YK algorithms on three data sets are tabulated below. The last two columns provide results using the original (non-packetized) YK algorithm, assuming the algorithm is allowed to run when the complete data set is available as a single file. The performance of the non-greedy packetized YK algorithm is close to that of this ideal but infeasible algorithm.
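The packet constraint described in the abstract (each packet decodable without any later packet, but free to rely on earlier ones) can be illustrated with a minimal sketch. It is not the YK algorithm: it substitutes DEFLATE from Python's standard `zlib` module, and the request contents and function names are hypothetical. One compressor is kept alive across packets and flushed at every packet boundary, and the result is compared with compressing each packet from scratch.

```python
import zlib


def compress_independently(packets):
    """Baseline: every packet gets a fresh compressor, so no history is shared."""
    return [zlib.compress(p) for p in packets]


def compress_with_shared_state(packets):
    """One compressor for the whole stream, flushed at every packet boundary.

    Z_SYNC_FLUSH emits all pending output, so each chunk can be decoded as
    soon as it arrives, using only the chunks that came before it.
    """
    comp = zlib.compressobj()
    return [comp.compress(p) + comp.flush(zlib.Z_SYNC_FLUSH) for p in packets]


def decompress_with_shared_state(chunks):
    """Decode chunks in arrival order with one decompressor, mirroring the encoder."""
    decomp = zlib.decompressobj()
    return [decomp.decompress(c) for c in chunks]


# Hypothetical HTTP requests that differ only in the requested path.
http_requests = [
    (f"GET /images/item{i}.png HTTP/1.1\r\n"
     "Host: www.example.com\r\n"
     "User-Agent: ExampleBrowser/1.0\r\n"
     "Accept: image/png,image/*;q=0.8,*/*;q=0.5\r\n"
     "Accept-Language: en-US,en;q=0.5\r\n"
     "Connection: keep-alive\r\n\r\n").encode("ascii")
    for i in range(20)
]

independent = compress_independently(http_requests)
shared = compress_with_shared_state(http_requests)

# Lossless round trip: each chunk decodes to exactly one original packet.
assert decompress_with_shared_state(shared) == http_requests

print("raw bytes:          ", sum(len(r) for r in http_requests))
print("independent packets:", sum(len(c) for c in independent))
print("shared history:     ", sum(len(c) for c in shared))
```

In this sketch the shared history is DEFLATE's sliding window, which is limited to 32 KiB, whereas the abstract describes carrying a grammar across packets. The sketch only shows why history helps on near-identical packets; it says nothing about the greedy and non-greedy variants proposed in the paper.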
Similar resources
Architecture for Efficient Implementation of the YK Lossless Data Compression Algorithm
The YK (Yang-Kieffer) algorithm [1,2] is a sequential lossless data compression algorithm built upon an efficient greedy grammar transform. This grammar transform constructs a sequence of irreducible context-free grammars from which the original data sequence can be recovered incrementally. The transform sequentially parses the original data into nonoverlapping variable-length phrases, and updates the gr...
Applications of YK Algorithm to the Internet Transmission of Web-Data: Implementation Issues and Modifications
Recently, Yang and Kieffer proposed a novel lossless grammar-based data compression algorithm, called the YK algorithm, in which a greedy sequential grammar transform is applied to the original data to construct an irreducible context-free grammar, which is encoded indirectly by using an arithmetic coder. The YK algorithm has been shown to be universal for the class of stationary, ergodic sour...
Joint Source and Channel Coding for Internet Image Transmission
Images are usually transmitted across the Internet using a lossless protocol such as TCP/IP. Lossless protocols require retransmission of lost packets, which substantially increases transmission time. We introduce a fast lossy Internet image transmission scheme (FLIIT) for compressed images which uses forward error correction to eliminate retransmission delays. FLIIT couples an embedded quantiz...
Hybrid Algorithm for Lossless Image Compression using Simple Selective Scan order with Bit Plane Slicing
Problem statement: Identifying a new lossless image compression algorithm for high-performance applications such as medical and satellite imaging, where a high-quality lossless reproduction of the image is essential for classifying the data for decision making. Approach: A new lossless hybrid algorithm based on simple selective scan order with Bit Plane Slicing method is presented for lossl...